ABSTRACT

This paper reports on a whole-school staff development program
developed and field-tested by the Appalachia Educational Laboratory,
Inc. The program, called QUILT (Questioning and Understanding To
Improve Learning and Thinking), was designed to improve the
questioning skills of classroom teachers by helping them individually
and with colleagues. QUILT includes extensive data collection and
analysis to assess effectiveness, including assessment of participant
knowledge, attitudes, and classroom behaviors. The behaviors of
interest include: number of teacher and student initiated questions,
use of wait-time, cognitive levels of questions and student answers,
manner of designating students to answer questions, and use of various
types of desirable and undesirable teacher responses or feedback. The
aspect of QUILT examined here, the evaluation of teacher classroom
questioning behaviors, is based on classroom observation, pre- and
post-QUILT assessment of teacher knowledge, attitudes, and classroom
questioning practices, and videotaped observation and coding of
teacher behaviors. Data analysis revealed strong evidence that teacher
behavior is positively influenced by the QUILT program. (Contains 19
references.) (LL)
The teacher evaluation activity reported here is an integral part
of a staff development program developed by the Appalachia Educational 
Laboratory, Inc. (AEL) called QUILT, designed to improve classroom 
teacher questioning skills. QUILT includes extensive data collection 
and analysis to assess effectiveness including assessment of 
participant knowledge, attitudes, and classroom behaviors. The aspect 
of QUILT research reported here is the evaluation of teacher classroom 
questioning behaviors based on classroom observation. 

Research indicates that as much as 40% of classroom time is spent 
in a question-response mode (Johnson, Markle, & Haley-Oliphant, 1987).
Nevertheless, many teachers do not ask questions effectively (Gall,
1984). Ineffective or inappropriate practices include asking questions
at only lower cognitive levels (Ornstein, 1987), directing a
disproportionate percentage of questions toward a limited number of
students (Jones, 1990), or waiting too little time after asking a
question before reacting to the student response, typically one second
or less (Rowe, 1986). Questions too often flow in only one direction
and become a way of maintaining control rather than stimulating
thought. For example, teachers are likely to ask as many as 50
questions during a typical class period while it is unlikely that the
students in the class ask even one question (McGlathery, 1978).



QUILT Overview 

QUILT is a staff development program for classroom teachers. Its 
goal is to provide a focal point for whole-school staff development by 
helping teachers, individually and with colleagues, to improve their
skills in asking questions, a teaching strategy used in all subjects
K-12. QUILT stands for Questioning and Understanding to Improve Learning
and Thinking. It was designed and field tested by the Appalachia 
Educational Laboratory (AEL) of Charleston, WV, in collaboration with 
school personnel in Kentucky, North Carolina, Tennessee, Virginia, and 
West Virginia. 

QUILT has four components: induction training, collegiums, 
partnering, and independent study and analysis. Induction training is 
a three-day, 18-hour program, conducted by trainers trained by AEL, 
where participants are provided research-based knowledge and theory, as 
well as frequent opportunities to practice effective questioning 
techniques. Experience indicates that the three-day induction period 
produces a degree of bonding among participants that does not occur 
during shorter sessions. The selection of QUILT as the program 
acronym had a major effect on development of participant bonding.
Not only have the program components and training been designed around
the development of a quilt, but the stories about family quilts shared
by participants in getting-acquainted sessions have resulted in
within-group personal cohesiveness rarely seen in staff development.

During the school year, teachers and administrators meet seven 
times in forums designed to review information about questioning and
reinforce changes in teacher questioning behaviors. These are referred
to as collegiums and, although they are open-ended, each focuses on
understanding and reinforcing particular questioning skills and
behaviors. Partnering involves teams of peer teachers in
ongoing, mutual support activities within the schools. These
activities include visiting each other's classrooms to observe and
monitor progress in questioning and to provide support and
encouragement. Throughout the year participants read independently, 
practice their skills, and compile data on their own classroom 
behaviors and student responses. 

QUILT differs in significant ways from the approaches to staff 
development most frequently employed by schools. First, QUILT treats 
staff development as a long-term commitment. Research indicates that
only a small percentage of teachers, perhaps as low as ten percent,
change their behavior in response to a training program unless lectures
or seminars are reinforced by feedback in a classroom setting (Joyce &
Showers, 1982). QUILT is a multi-year program and the "partnering"
approach to classroom instruction is central to its design.

Second, QUILT represents a "whole-school" approach to staff 
development. Because questioning is a generic educational activity, 
improving questioning skills is relevant across the curriculum from
kindergarten to the 12th grade. The partnering approach reflects this 
generic quality since teachers across subject areas can work together 
to improve questioning skills. 

Third, QUILT is student-centered. While it is fashionable to make 
this claim for almost any program, the entire purpose of the QUILT 
five-stage model is to stimulate student thinking, particularly higher 
order thinking. As Dillon (1984) states, "to conceive an educative 
question requires thought; to formulate it requires labor; and to pose 
it, tact." The QUILT five-stage model helps teachers view questioning 
as a process which begins with planning the question and ends with
reflectively evaluating the effectiveness of the questioning episode. 
More specifically, the five stages are as follows. Stage 1 relates to 
preparing the question. It includes identifying the instructional 
purpose, determining the content focus, selecting the appropriate 
cognitive level, and considering wording and context. Stage 2 relates 
to presenting the question. It includes indicating the response 
format, asking the question, and respondent selection. Stage 3 relates 
to prompting student response. It includes pausing after asking the 
question, assisting nonrespondents, and pausing after student response. 
Stage 4 relates to processing the student response. It includes 
provision of appropriate feedback, expanding and using correct 
responses, and eliciting student reactions and questions. Stage 5 
relates to critiquing the questioning episode. It includes analyzing
the question, mapping respondent selection, evaluating student response 
patterns, and examining teacher and student reactions. 



During the field test year, QUILT was implemented in a manner 
which permitted assessment of its effectiveness. Three levels of
implementation of QUILT components were initiated. Schools were
randomly assigned to one of the three groups. Group A schools
completed the full QUILT program which included three-day induction, 
collegiums, partnering, and other independent study activities. Group 
B schools completed only the three-day induction program and Group C 
schools received only a three-hour orientation session related to QUILT 
questioning concepts. 

An extensive research design was developed to assess QUILT 
effectiveness (Barnette & Sattes, 1991). This included pre- and
post-QUILT assessment of teacher knowledge, attitudes, and classroom
questioning practices. In addition, evaluation of all aspects of 
program delivery and implementation was conducted. The aspect of QUILT 
research reported here is related to the evaluation of teacher 
classroom questioning practices based on the videotaped observation and 
coding of teacher behaviors. 



Development of the Classroom Questioning Observation Instrument 

The Classroom Questioning Observation Instrument (CQOI) was 
developed as one data-gathering tool needed for the QUILT research 
design. Its primary purpose is to collect specific information on 
teachers' classroom questioning behaviors with the data being used to 
help analyze teacher classroom questioning behavior change. 
Specifically, it was designed to address one of the QUILT research 
hypotheses (Barnette & Sattes, 1991), namely:

There will be a significant difference between the three groups 
on the dependent variables related to teacher classroom
questioning behavior, as measured by the Classroom Questioning
Observation Instrument. These differences will be directional in 
nature, with Condition A having the highest level of desirable 
behaviors, Condition B having the second highest, and Condition C 
having the lowest level of desirable behaviors. 

More specifically, the behaviors of interest included: number of 
teacher and student initiated questions, use of wait-time I, use of
wait-time II, cognitive levels of questions and student answers, manner
of designating students to answer questions, and use of various types 
of desirable and undesirable teacher responses or feedback. 

Because participating teachers were spread out over five states 
and in an attempt to reduce obtrusiveness of an actual observer in the 
classroom, it was decided to have 15-minute videotapes recorded, which 
would be reviewed and coded by trained coders. The CQOI is a low 
inference, multiple code, category system observation instrument. As 
such, it was designed and developed with these factors in mind: 

1. a format which provided for ease of data collection and 
analysis, 

2. clearly stated definitions to increase coder reliability and
inter-rater agreement,

3. a direct connection to the QUILT materials, and

4. the desired outcomes, both in terms of research and the 
usability of information to teachers participating in the 
QUILT program. 

Dr. Debra Sullivan, the CQOI developer, used prior knowledge of 
other classroom observation instruments, QUILT materials, and classroom 
visits to design the instrument. Throughout the instrument's formative
stages of development, the developer visited classrooms and collected
data using draft versions of the instrument. Using this process, not 
only was it possible to assess specific research questions, but "real 
life" usability in classroom situations was assured. Meetings were 
held with AEL staff to ensure a match between the research design and 
the teacher behavior data collection device. Points raised at these 
meetings were used to modify the CQOI, increasing the level of content 
validity. 

For logistic reasons, it was decided to have all coders living in 
the Charleston area. Four middle school and high school teachers were 
selected by the CQOI developer to participate in coder training. All 
of the selected teachers were considered extremely capable and 
competent teachers who represented several major curricular areas 
including language arts, social studies, mathematics, science, and 
foreign language. 

Coders were trained using a variety of methods including group 
sessions as well as independent work. During the training sessions, 
coders: 

1. were acquainted with the QUILT program and its research 
design, 

2. were familiarized with the CQOI in terms of format, 
definitions, and manner of completion, 

3. practiced coding transcripts of classroom sequences featuring 
questioning interactions between teachers and students, and 

4. practiced coding videotapes of classroom episodes. 

Similarly, the independent work completed by coders focused on written 
transcripts of classroom interactions as well as classroom videotapes. 
During the coder training, CQOI codes and their definitions were
discussed and defined more clearly, thus ultimately assuring higher 
levels of coder validity and reliability. 

Since 15-minute videotapes of classroom teaching episodes were 
used rather than direct observation, coder speed was not an area of 
concern. Coders were able to replay the tape to check coding for
accuracy and reconsideration. Therefore, only accuracy in coding
classroom questioning behaviors was necessary to determine coder 
reliability. Reliability was established by comparing coder responses 
with those of the CQOI developer on the same videotape. Coding
agreement ranged from 90% to 94%, with an average agreement of
92%. Coders did not know the teachers who were observed, nor did they
know which QUILT condition they represented. 
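
The paper does not state how coder agreement was calculated. The sketch below assumes simple percent agreement, the share of coding decisions on which a coder matches the CQOI developer. The function name and the ten sample codes are hypothetical and are not study data.

```python
def percent_agreement(coder_codes, reference_codes):
    """Return the percentage of coding decisions on which two coders match."""
    if len(coder_codes) != len(reference_codes):
        raise ValueError("both code sequences must cover the same decisions")
    if not coder_codes:
        raise ValueError("at least one coding decision is required")
    matches = sum(c == r for c, r in zip(coder_codes, reference_codes))
    return 100.0 * matches / len(coder_codes)


# Hypothetical codes for ten decisions on one videotape (not study data).
developer = ["recall", "recall", "utilization", "recall", "creation",
             "recall", "check", "recall", "utilization", "recall"]
coder = ["recall", "recall", "utilization", "check", "creation",
         "recall", "check", "recall", "utilization", "recall"]

agreement = percent_agreement(coder, developer)
print(f"{agreement:.0f}% agreement")
```

Percent agreement does not correct for chance agreement; whether the original analysis applied such a correction is not reported.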

The CQOI permits mostly linear coding of many teacher
behaviors and characteristics of questioning in the classroom. Each
questioning episode is recorded in terms of whether it was teacher or
student initiated. For teacher initiated questions, whether the
teacher designated a student to answer before or after asking the
question is then recorded. The level of question is recorded as being
recall, check for understanding, utilization, or creation. Wait-time
I, the time a teacher waits before acknowledging a student response to
an initial question, is recorded by checking the number of seconds.
The student answering, whether the one designated before or after the
question was asked, is recorded. The number of students responding is
recorded as one, more than one, or whole class (choral response). The
level of student answer is recorded as being recall, check for
understanding, utilization, or creation. The student answer is also
recorded as being correct, partially correct, wrong, no answer, or
inappropriate response, along with whether the student asks for
clarification or extends his/her answer.

Wait-time II, the time the teacher waits before reacting to the
student answer, is then recorded. The teacher reaction is recorded as
being positive feedback, praise, negative feedback, corrective
feedback, criticism, or no feedback given. In addition, other teacher
behaviors are recorded including whether the teacher probes, repeats or
rephrases the question, repeats or rephrases the student answer, uses
the student response in discussion or new questions, and/or redirects
the question to other students.
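
As an illustration of how coded episodes could be turned into the percentage variables analyzed later, the sketch below stores a few of the coded characteristics in a record and computes percentages over teacher initiated episodes. The record layout, field names, choice of denominator, and sample episodes are all assumptions made for this example; the actual CQOI form and scoring rules are not reproduced in this paper.

```python
from dataclasses import dataclass


@dataclass
class QuestioningEpisode:
    """One coded questioning episode (field names are illustrative only)."""
    teacher_initiated: bool
    question_level: str        # "recall", "check", "utilization", or "creation"
    wait_time_1_seconds: int   # pause before acknowledging a response
    wait_time_2_seconds: int   # pause before reacting to the student answer
    redirected: bool           # question redirected to other student(s)


def percent_meeting(episodes, predicate):
    """Percentage of teacher initiated episodes for which predicate is true."""
    teacher_eps = [e for e in episodes if e.teacher_initiated]
    if not teacher_eps:
        return 0.0
    return 100.0 * sum(predicate(e) for e in teacher_eps) / len(teacher_eps)


# Hypothetical coded episodes from one videotape (not study data).
episodes = [
    QuestioningEpisode(True, "recall", 1, 0, False),
    QuestioningEpisode(True, "utilization", 3, 1, True),
    QuestioningEpisode(True, "recall", 0, 0, False),
    QuestioningEpisode(True, "creation", 4, 3, False),
    QuestioningEpisode(False, "recall", 0, 0, False),
]

wait_1 = percent_meeting(episodes, lambda e: e.wait_time_1_seconds >= 3)
above_recall = percent_meeting(episodes, lambda e: e.question_level != "recall")
redirected = percent_meeting(episodes, lambda e: e.redirected)
print(wait_1, above_recall, redirected)
```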



Data Analysis 

It is beyond the scope of this paper to present all of the 
results of the comparison of pre-QUILT and post-QUILT classroom 
behaviors across the three QUILT training conditions. This analysis is 
limited to seven of the more important variables measured by coding of 
the observations prior to the start of QUILT (referred to as pre) and 
again at the end of the first complete year of QUILT operation 
(referred to as post). Three different groups were included in this
analysis: 

Condition A (Full QUILT model including induction and collegiums) 
Condition B (QUILT induction without collegiums) 
Condition C (QUILT awareness workshop) 

The data were analyzed using several programs of the SAS package
(SAS is a registered trademark of SAS Institute Inc., Cary, NC).
For each QUILT variable, the following analyses were conducted: 

1. Univariate summary statistics were computed for pre test 
results, post test results, and post-pre test results. 
Included were tests for normality and provision of data for 
computation of Fmax statistics for checking analysis of
variance assumptions. These results were used to compute 
effect sizes. The pre test standard deviation for participant 
scores in all three groups was used as the base for the effect 
size. The post test minus pre test means were divided by the
overall pretest standard deviation to obtain the effect size. 

2. The GLM procedure was conducted as a mixed design, with a
between subjects factor (condition) and a within subjects
factor (testing time). Of primary concern were two planned
follow-ups of the interaction. Because these comparisons were
planned, a significant interaction of condition and time was
not required to conduct these follow-ups.

3. The first follow-up procedure involved the comparison of 
pre and post test means within each condition. These were 
compared using directional, dependent t tests with alpha set 
at 0.05. 

4. The second follow-up procedure involved the comparison of
post test means of condition A with each of the other groups
(A with B and A with C). These were compared using
directional Dunnett t tests with alpha set at 0.05. The
Dunnett procedure is specifically designed for situations where
each group is compared with a control group, or where all groups
are compared with only one other group. In this case condition A
was compared with each of the other groups. The Dunnett procedure
controls the Type I error rate in an experiment-wise manner. It is
one of the few planned follow-up procedures which can be used to
test directional hypotheses. Thus, it has high statistical power,
but it is limited to k - 1 pairwise comparisons, where k is the
number of groups.

5. The third follow-up procedure involved the comparison of the
pre to post test change mean of condition A with each of the
other groups (A with B and A with C). These were compared
using directional Dunnett t tests with alpha set at 0.05.
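
The effect size in step 1 and the dependent t test in step 3 can be sketched as follows. This is a minimal illustration, not the SAS code used in the study. The overall pretest standard deviation is not reported in this paper, so the value 16.0 is an assumption chosen for illustration, and the paired scores are hypothetical. The function names are also illustrative.

```python
import math
import statistics


def effect_size(pre_mean, post_mean, overall_pre_sd):
    """(post mean - pre mean) / overall pretest standard deviation."""
    return (post_mean - pre_mean) / overall_pre_sd


def dependent_t(pre_scores, post_scores):
    """Return the paired t statistic and its degrees of freedom (n - 1)."""
    diffs = [post - pre for pre, post in zip(pre_scores, post_scores)]
    n = len(diffs)
    standard_error = statistics.stdev(diffs) / math.sqrt(n)
    return statistics.mean(diffs) / standard_error, n - 1


# Condition A means for number of teacher questions (Table 1).
# The overall pretest SD is not reported; 16.0 is an assumed value.
es = effect_size(pre_mean=41.4, post_mean=31.0, overall_pre_sd=16.0)
print(round(es, 2))  # -0.65

# Hypothetical paired scores for five teachers (not study data).
t_stat, df = dependent_t([40, 45, 38, 50, 42], [30, 41, 35, 39, 40])
print(round(t_stat, 2), df)
```

For a directional test, the t statistic would be compared with a one-tailed critical value at alpha = 0.05 for the given degrees of freedom.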



Results 

Number of Teacher Questions 

During the 15-minute videotape, the number of teacher initiated
questions was recorded. The desirable change was a decrease on this
variable. Results for this variable are presented in
Table 1. All three groups had reductions in the number of teacher
questions. This reduction was significant for conditions A and C, with
effect sizes of -0.65 for condition A and -0.44 for condition C.



Table 1. Number of Teacher Questions by QUILT Condition

                                                  Significant Group
                                                  Differences with
                    Mean/SD                       Condition A at
QUILT Group      Pre      Post      ES        p        Post     Change

Condition A     41.4      31.0    -0.65    <0.001      A<B      none
n=37            15.8      14.5

Condition B     44.9      40.5    -0.27     nsd
n=28            17.4      13.8

Condition C     43.3      36.3    -0.44    <0.05
n=30            15.5      14.4

At post, condition A had a significantly lower number of teacher 
initiated questions than condition B. There were no significant 
differences between condition A and conditions B or C relative to the 
degree of change between pre and post test, although the difference was 
in the predicted direction. 



Wait-time I

Wait-time I is the time a teacher waits after asking a question
before acknowledging or reacting to a student's response. It is
recommended that this time be three seconds or longer. Results for
this variable are found in Table 2. It was predicted that this
variable would increase as a result of QUILT. Both conditions A and B
had significant increases in this variable, with effect sizes of +0.99
for condition A and +0.78 for condition B. Condition A had a
significantly higher mean at post as well as significantly more pre to
post change as compared with condition C.

Table 2. Wait-time I, Percentage at Three or More Seconds
by QUILT Condition

                                                  Significant Group
                                                  Differences with
                    Mean/SD                       Condition A at
QUILT Group      Pre      Post      ES        p        Post     Change

Condition A     12.8      25.0    +0.99    <0.01       A>C      A>C
n=37            11.9      24.9

Condition B     11.1      20.7    +0.78    <0.01
n=28            10.1      19.5

Condition C     10.1      11.5    +0.11     nsd
n=30            14.8      16.5



Wait-time II 

Wait-time II is the time a teacher waits after a student's
response to a question before acknowledging or reacting to that
response. It is recommended that this time be three seconds or longer.
Results for this variable are found in Table 3. It was desired that
this variable increase. While the level of use of wait-time II is
very low, there was a significant pre to post change for condition A,
with an effect size of +1.72. At post, condition A was significantly
higher than both conditions B and C. Condition A had a significantly
higher pre to post change as compared with condition C.

Table 3. Wait-time II, Percentage at Three or More Seconds
by QUILT Condition

                                                  Significant Group
                                                  Differences with
                    Mean/SD                       Condition A at
QUILT Group      Pre      Post      ES        p        Post       Change

Condition A     0.52      2.98    +1.72    <0.05     A>B, A>C     A>C
n=37            1.28      6.73

Condition B     0.10      0.59    +0.34     nsd
n=28            0.51      1.61

Condition C     0.59      0.97    +0.26     nsd
n=30            2.06      4.57



Teacher Questions Above Recall Levels 

The percentage of teacher initiated questions that were above the
recall cognitive level was determined, and results are presented in
Table 4. An objective of QUILT training is to increase the frequency of
higher level questioning. Condition A was the only group to have a
significant pre to post change, with an effect size of +0.43.

Table 4. Cognitive Level of Question, Percentage above Recall Level,
by QUILT Condition

                                                  Significant Group
                                                  Differences with
                    Mean/SD                       Condition A at
QUILT Group      Pre      Post      ES        p        Post

Condition A     31.0      41.2    +0.43    <0.05       none
n=37            23.3      27.8

Condition B     41.0      39.2    -0.07     nsd
n=28            24.8      30.1

Condition C     26.3      32.0    +0.24     nsd
n=30            22.0      22.7



Percentage of Time Teacher Redirects Question to Other Student(s)

The percentage of times a teacher redirects a question to other
student(s) was determined and is reported in Table 5. A QUILT
objective was for this to increase. Condition A had a significant pre
to post change with an effect size of +0.59. At post, condition A had a
significantly higher mean than condition C, and condition A had
significantly higher pre to post change than both conditions B and C.

Table 5. Question Redirected to Another Student
Percentage by QUILT Condition

                                                  Significant Group
                                                  Differences with
                    Mean/SD                       Condition A at
QUILT Group      Pre      Post      ES        p        Post       Change

Condition A     14.1      23.2    +0.59    <0.01       A>C      A>B, A>C
n=37            14.5      19.9

Condition B     20.6      19.4    -0.08     nsd
n=28            16.7      14.9

Condition C     18.1      12.3    -0.37     nsd
n=30            15.0      14.5



Percentage of Time Student Designated to Answer After Question Asked 

The percentage of times the teacher designated which student was to
answer a question after it was asked was determined, and results are
presented in Table 6. It was a QUILT objective to increase this
practice because, when the student is designated before the question
rather than after it, other students often feel they are not involved
and reduce or discontinue their involvement in the interaction. Both
conditions A and C had significant pre to post changes, with the effect
size for condition A at +0.39 and for condition C at +0.35. At post,
the condition A mean was significantly higher than the condition B mean.

Table 6. Student Designated after Question
Percentage by QUILT Condition

                                                  Significant Group
                                                  Differences with
                    Mean/SD                       Condition A at
QUILT Group      Pre      Post      ES        p        Post     Change

Condition A     84.1      90.8    +0.39    <0.01       A>B      none
n=37            12.8       9.3

Condition B     83.1      85.3    +0.13     nsd
n=28            23.2      11.2

Condition C     83.5      89.4    +0.35    <0.05
n=30            14.4      11.2



Percentage of Time Teacher Repeats Student Answer 

Another variable that QUILT was designed to decrease was the
percentage of time a teacher repeats the student answer. Often when
this happens, other students take it as acknowledgment that the
response is correct and see no need to continue thinking. If the
teacher gets the answer, students often tune out. Table 7 presents the
results for this variable. Condition A was the only one to have a
significant pre to post reduction in this behavior, with an effect size
of -0.43.

Table 7. Teacher Repeats Student Answer
Percentage by QUILT Condition

                                                  Significant Group
                                                  Differences with
                    Mean/SD                       Condition A at
QUILT Group      Pre      Post      ES        p        Post     Change

Condition A     62.4      54.6    -0.43    <0.05       none     none
n=37            18.9      28.5

Condition B     60.5      55.9    -0.25     nsd
n=28            14.3      17.9

Condition C     59.4      61.5    +0.11     nsd
n=30            20.9      25.5



Summary of Differences Between Condition A and Other Groups at Post 

Condition A had a significantly lower number of teacher questions 
at post than condition B. It had a significantly higher percentage of
wait-time I of three seconds or longer as compared with condition C.
Condition A had a significantly higher percentage of wait-time II at
three seconds or more than either of the other two conditions.
Condition A had a significantly higher percentage of times the teacher
redirects the question to other student(s) as compared with condition
C. Condition A had a significantly higher percentage of times the 
teacher designated the student to answer after asking the question as 
compared with condition B. 

Summary of Differences Between Pre-Post Change of Condition A and Other
Groups

Condition A had significantly more desirable pre to post change
than condition C on the variables of wait-time I, wait-time II, and
question redirection to other student(s). Condition A had
significantly more desirable pre to post change than condition B on the
variable of question redirection to other student(s).



Conclusions 

It is clear that condition A had the greatest degree of change in 
predicted and/or desirable directions. On all seven of the selected 
variables of the CQOI, there was a significant pre to post change for 
condition A, compared with one change for condition B, and two such 
changes for condition C. On post comparisons, the condition A mean was 
more favorable than the condition B mean on three of the variables and 
more favorable than the condition C mean on three of the variables. 
Condition A had higher positive pre to post change than condition B on 
one of the variables and higher than condition C on three of the 
variables. Since these teachers were randomly selected and assigned to 
the three treatment conditions, there is strong evidence that teacher 
behavior has been positively influenced by the QUILT program.
Observational data help answer questions about the impact of staff
development, in particular whether participants go beyond learning more
about classroom questioning and are able to apply that learning in the
actual classroom situation. The CQOI provided a reliable and valid
method of evaluating teacher classroom questioning behaviors while
remaining relatively unobtrusive.
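
The counts of significant pre to post changes cited above can be tallied directly from the p columns of Tables 1 through 7. The short sketch below does this; the dictionary layout and variable labels are illustrative, and an entry is marked True when the reported p was below 0.05 and False when the table shows "nsd".

```python
# Significance of the pre to post change, taken from the p column
# of Tables 1-7 (True: p < 0.05 or smaller; False: "nsd").
significant_change = {
    "teacher questions":         {"A": True, "B": False, "C": True},
    "wait-time I":               {"A": True, "B": True,  "C": False},
    "wait-time II":              {"A": True, "B": False, "C": False},
    "questions above recall":    {"A": True, "B": False, "C": False},
    "question redirected":       {"A": True, "B": False, "C": False},
    "designated after question": {"A": True, "B": False, "C": True},
    "teacher repeats answer":    {"A": True, "B": False, "C": False},
}

# Count the variables with a significant change in each condition.
counts = {
    condition: sum(row[condition] for row in significant_change.values())
    for condition in ("A", "B", "C")
}
print(counts)  # {'A': 7, 'B': 1, 'C': 2}
```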